FASTUS: A Finite-state Processor for Information Extraction from Real-world Text

Authors

  • Douglas E. Appelt
  • Jerry R. Hobbs
  • John Bear
  • David J. Israel
  • Mabry Tyson
Abstract

Approaches to text processing that rely on parsing the text with a context-free grammar tend to be slow and error-prone because of the massive ambiguity of long sentences. In contrast, FASTUS employs a nondeterministic finite-state language model that produces a phrasal decomposition of a sentence into noun groups, verb groups and particles. Another finite-state machine recognizes domain-specific phrases based on combinations of the heads of the constituents found in the first pass. FASTUS has been evaluated on several blind tests that demonstrate that state-of-the-art performance on information-extraction tasks is obtainable with surprisingly little computational effort.
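
The two-pass idea in the abstract can be illustrated with a minimal sketch. This is not the FASTUS grammar or code: the lexicon, the tag set, the chunk patterns, the event pattern, and the function names `chunk` and `extract` are all hypothetical, and Python regular expressions stand in for the finite-state machines. The first pass groups tagged words into noun groups, verb groups and particles; the second pass matches a domain pattern over the sequence of groups and reads off their heads.

```python
import re

# Toy lexicon (an assumption for illustration, not taken from FASTUS).
LEXICON = {
    "the": "DET", "a": "DET",
    "armed": "ADJ", "local": "ADJ",
    "guerrillas": "NOUN", "mayor": "NOUN",
    "have": "AUX", "was": "AUX",
    "kidnapped": "VERB", "attacked": "VERB",
    "by": "PART",
}

# One letter per part of speech so a sentence becomes a string of tags.
TAG_CODE = {"DET": "D", "ADJ": "J", "NOUN": "N",
            "AUX": "X", "VERB": "V", "PART": "P"}

# Pass 1: regular patterns for noun groups (NG), verb groups (VG), particles (P).
CHUNK_RE = re.compile(r"(?P<NG>D?J*N+)|(?P<VG>X*V)|(?P<P>P)")

# Pass 2: regular patterns over the sequence of chunk types.
CHUNK_CODE = {"NG": "N", "VG": "V", "P": "P"}
EVENT_RE = re.compile(r"(?P<passive>NVPN)|(?P<active>NVN)")


def chunk(sentence):
    """Split a sentence into (kind, head, words) chunks; unknown words are skipped."""
    tokens = sentence.lower().split()
    # Unknown words get the code "O", which no chunk pattern matches.
    codes = "".join(TAG_CODE.get(LEXICON.get(t), "O") for t in tokens)
    chunks = []
    for m in CHUNK_RE.finditer(codes):
        words = tokens[m.start():m.end()]
        # Simplification: treat the last word of a group as its head.
        chunks.append((m.lastgroup, words[-1], words))
    return chunks


def extract(sentence):
    """Match the event patterns against chunk types and return the chunk heads."""
    chunks = chunk(sentence)
    codes = "".join(CHUNK_CODE[kind] for kind, _, _ in chunks)
    events = []
    for m in EVENT_RE.finditer(codes):
        heads = [head for _, head, _ in chunks[m.start():m.end()]]
        if m.lastgroup == "passive":
            target, action, _, agent = heads
        else:
            agent, action, target = heads
        events.append({"action": action, "agent": agent, "target": target})
    return events


if __name__ == "__main__":
    print(extract("The armed guerrillas have kidnapped the local mayor"))
    print(extract("The mayor was kidnapped by armed guerrillas"))
```

Neither pass builds a full parse tree. Each is a single left-to-right scan with a regular pattern, which is the property the abstract credits for the low computational cost.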

Similar resources

SRI International: Description of the FASTUS System Used for MUC-3

FASTUS is a (slightly permuted) acronym for Finite State Automaton Text Understanding System. It is a system for extracting information from free text in English, and potentially other languages as well, for entry into a database, and potentially for other applications. It works essentially as a cascaded, nondeterministic finite state automaton. It is an information extraction system, rather th...

SRI International: description of the FASTUS system used for MUC-4

FASTUS is a (slightly permuted) acronym for Finite State Automaton Text Understanding System. It is a system for extracting information from free text in English, and potentially other languages as well, for entry into a database, and potentially for other applications. It works essentially as a cascaded, nondeterministic finite state automaton. It is an information extraction system, rathe...

The SRI TIPSTER II Project

SRI participated in the Architecture Working Group (AWG) meetings and aided in the design, testing, and implementation of the TIPSTER document manager architecture. Their contributions concerned input on the nature of basic entities, such as documents and text segments, and ways of communicating information from extraction modules to other modules in order to allow extraction and detection modu...

SRI: description of the JV-FASTUS system used for MUC-5

INTRODUCTION AND BACKGROUND: SRI International developed an information extraction system called FASTUS, a permuted acronym standing for "Finite State Automata-based Text Understanding System." The choice of acronym is somewhat misleading, however, because FASTUS is a system for information extraction, not text understanding. The former problem is much simpler and more tractable, characterized ...

Publication date: 1993